G84 vs. RV630:
GeForce 8600 GT is based on Nvidia’s G84 graphics processing unit – the same GPU that’s used by the GeForce 8600 GTS, which we covered back in April. To save repeating ourselves, if you want a full run-down of G84’s capabilities, we suggest you read our analysis of the chip’s architecture.
In terms of raw shader power, G84’s architecture pales in comparison with AMD’s RV630, as the chip features only 32 scalar (1D) stream processors clocked at 1180MHz, compared to the 24 5D shader processors (120 stream processors) clocked at 800MHz in RV630.
When you tally up the raw compute power, RV630 weighs in at 192 GigaFLOPS, while G84 is left trailing at just 113.3 GigaFLOPS. Of course, raw shader throughput doesn’t tell the whole story, and several other factors bring these two chips much closer together.
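For readers who want to check the sums, here is a minimal sketch of how the two GigaFLOPS figures can be reproduced. The flops-per-clock values are our assumption rather than something stated above: three per stream processor for G84 (a multiply-add plus a multiply) and two per stream processor for RV630 (a multiply-add). The function name is purely illustrative.

```python
def peak_gflops(stream_processors, flops_per_clock, shader_clock_mhz):
    """Theoretical peak shader throughput in GigaFLOPS."""
    return stream_processors * flops_per_clock * shader_clock_mhz / 1000.0

# G84: 32 scalar stream processors at 1180MHz.
# Assumption: 3 flops per clock each (multiply-add + multiply).
g84_gflops = peak_gflops(32, 3, 1180)

# RV630: 24 five-wide shader processors = 120 stream processors at 800MHz.
# Assumption: 2 flops per clock each (multiply-add).
rv630_gflops = peak_gflops(24 * 5, 2, 800)

print(f"G84:   {g84_gflops:.1f} GFLOPS")
print(f"RV630: {rv630_gflops:.1f} GFLOPS")
```

Under those assumptions the numbers land on the 113.3 and 192 GigaFLOPS quoted above. They are theoretical peaks and say nothing about how well either architecture keeps its units busy.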
Compared to G80, Nvidia improved the chip’s texturing capabilities from four texture addresses per clock, per shader cluster, to eight texture addresses per clock. The chip’s texture filtering capabilities haven’t changed on a per-shader-cluster basis though, meaning the chip is able to do a total of 16 bilinear texture filter operations per clock cycle – this is double what RV630 can do clock for clock.
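As a rough sketch of what that per-clock advantage means in absolute terms, the snippet below scales the filter rates by clock speed. The per-cluster figure of eight and RV630's eight per clock are implied by the totals above. The 540MHz and 800MHz clocks are the core clocks of the GeForce 8600 GT and HD 2600 XT, and we are assuming the texture units run at those speeds.

```python
G84_CLUSTERS = 2
G84_FILTERS_PER_CLUSTER = 8   # implied by 16 total across two clusters
RV630_FILTERS_PER_CLOCK = 8   # implied: G84 does double this per clock

def filter_rate_mtexels(filters_per_clock, clock_mhz):
    """Bilinear filter rate in Mtexels/sec for a given clock in MHz."""
    return filters_per_clock * clock_mhz

g84_per_clock = G84_CLUSTERS * G84_FILTERS_PER_CLUSTER

# Assumption: texture units run at the core clock of each card.
g84_rate = filter_rate_mtexels(g84_per_clock, 540)
rv630_rate = filter_rate_mtexels(RV630_FILTERS_PER_CLOCK, 800)

print(f"G84:   {g84_per_clock} per clock, {g84_rate} Mtexels/sec")
print(f"RV630: {RV630_FILTERS_PER_CLOCK} per clock, {rv630_rate} Mtexels/sec")
```

If those clock assumptions hold, RV630's higher clock narrows the two-to-one per-clock gap, but G84 still comes out ahead.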
Nvidia has the advantage when it comes to pixel output too, with double the number of pixels per clock, as G84 has two ROP partitions that can each output four pixels per clock – RV630 has just the one ROP partition. However, it’s worth noting here that there is a clock speed difference between the ROPs – G84’s ROPs run at 540MHz on the GeForce 8600 GT, while RV630’s ROPs run at 800MHz on the HD 2600 XT. This results in pixel fillrates of 4320 Mpixels/sec and 3200 Mpixels/sec respectively.
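The fillrate figures follow directly from the partition counts and clocks above, as this short sketch shows. The function name is ours, and we take RV630's single partition to output four pixels per clock, which is what the quoted 3200 Mpixels/sec implies.

```python
def pixel_fillrate_mpixels(rop_partitions, pixels_per_partition, rop_clock_mhz):
    """Peak pixel fillrate in Mpixels/sec."""
    return rop_partitions * pixels_per_partition * rop_clock_mhz

# GeForce 8600 GT (G84): two ROP partitions, four pixels each, 540MHz.
g84_fill = pixel_fillrate_mpixels(2, 4, 540)

# HD 2600 XT (RV630): one ROP partition, four pixels per clock, 800MHz.
rv630_fill = pixel_fillrate_mpixels(1, 4, 800)

print(f"G84:   {g84_fill} Mpixels/sec")
print(f"RV630: {rv630_fill} Mpixels/sec")
print(f"G84 advantage: {g84_fill / rv630_fill:.2f}x")
```

So although G84 outputs twice as many pixels per clock, the clock speed gap trims its real advantage to 1.35 times.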
When it comes to Z-only processing, G84 really takes RV630 to the cleaners, as it's able to operate at 32 pixels per clock (four times normal speed). On the other hand, RV630 can only process Z-only samples at eight pixels per clock (double speed).
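To put the Z-only gap in perspective, here is a minimal sketch that applies each chip's Z-only multiplier to its normal pixels-per-clock figure and then scales by the ROP clocks of the GeForce 8600 GT (540MHz) and HD 2600 XT (800MHz). The names are illustrative, and the absolute rates are theoretical peaks that we have derived rather than measured.

```python
def z_only_per_clock(pixels_per_clock, z_multiplier):
    """Z-only samples per clock, given the colour pixel rate and Z speed-up."""
    return pixels_per_clock * z_multiplier

def rate_msamples(samples_per_clock, rop_clock_mhz):
    """Peak rate in Msamples/sec for a given ROP clock in MHz."""
    return samples_per_clock * rop_clock_mhz

# G84: eight pixels per clock, Z-only at four times normal speed.
g84_z = z_only_per_clock(8, 4)
# RV630: four pixels per clock, Z-only at double speed.
rv630_z = z_only_per_clock(4, 2)

g84_z_rate = rate_msamples(g84_z, 540)
rv630_z_rate = rate_msamples(rv630_z, 800)

print(f"G84:   {g84_z} per clock, {g84_z_rate} Msamples/sec")
print(f"RV630: {rv630_z} per clock, {rv630_z_rate} Msamples/sec")
print(f"G84 advantage: {g84_z_rate / rv630_z_rate:.1f}x")
```

Even after accounting for RV630's higher clock, G84's theoretical Z-only rate is 2.7 times higher on these figures.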
Memory Controllers:
Both RV630 and G84 make use of a 128-bit memory interface, although the implementations are slightly different – we’ll quickly highlight the differences here.
G84 features two 64-bit memory channels, with each channel assigned to one ROP partition. Since the ROP partitions are completely decoupled from the two stream processor clusters, the throughput of pixels is load balanced by a fragment crossbar – this helps to ensure that all ROPs are active.
RV630 uses a cut-down version of R600’s ring bus memory controller, which has three ring stops (two for the 64-bit memory channels and one for PCI-Express) and a bi-directional internal 256-bit bus (128-bit read, 128-bit write). Additionally, the memory controller is fully distributed around the chip, with an independent DMA unit that manages the three ring stops on the bus.